
# Unsupervised Preference Optimization

Mistral-ORPO-β
MIT
Mistral-ORPO-β is a 7B-parameter language model fine-tuned from Mistral-7B with the ORPO (Odds Ratio Preference Optimization) method, which learns preferences directly from pairwise data without a supervised fine-tuning (SFT) warm-up phase; a sketch of the objective follows below.
Large Language Model · Transformers · English
kaist-ai
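The ORPO objective combines the usual supervised (SFT) loss on the chosen response with an odds-ratio penalty that raises the model's odds of producing the chosen response over the rejected one, which is why no separate SFT warm-up or reference model is needed. Below is a minimal, illustrative PyTorch sketch of that combined loss; the function name and arguments are hypothetical, and it assumes `chosen_logps`/`rejected_logps` are length-normalized mean log-probabilities of each response under the current policy.

```python
import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps: torch.Tensor,
              rejected_logps: torch.Tensor,
              sft_nll: torch.Tensor,
              beta: float = 0.1) -> torch.Tensor:
    """Illustrative ORPO loss: SFT term plus a weighted odds-ratio term.

    chosen_logps / rejected_logps: mean token log-probabilities of the
    chosen and rejected responses (strictly negative, shape [batch]).
    sft_nll: negative log-likelihood of the chosen response.
    beta: weight of the odds-ratio term (lambda in the ORPO paper).
    """
    # log-odds of a response y: log p(y|x) - log(1 - p(y|x))
    log_odds_chosen = chosen_logps - torch.log1p(-torch.exp(chosen_logps))
    log_odds_rejected = rejected_logps - torch.log1p(-torch.exp(rejected_logps))
    # odds-ratio penalty: -log sigmoid(log-odds difference) favours the chosen response
    ratio_term = -F.logsigmoid(log_odds_chosen - log_odds_rejected)
    return (sft_nll + beta * ratio_term).mean()
```

In training, `sft_nll` would come from the standard cross-entropy over the chosen response, so the preference signal and the language-modeling signal are optimized in a single stage.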